AITopics | surrogate loss function

Collaborating Authors

surrogate loss function

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

NE: Surrogate-Assisted Federated Neighbor Embedding for Dimensionality Reduction

Neural Information Processing SystemsFeb-18-2026, 16:22:34 GMT

Despite its broad applications in fields such as computer vision, graph learning, and natural language processing, the development of a data projection model that can be effectively used to visualize data in the context of FL is crucial yet remains heavily under-explored. Neighbor embedding (NE) is an essential technique for visualizing complex high-dimensional data, but collab-oratively learning a joint NE model is difficult.

data mining, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Virginia (0.04)
North America > United States > Ohio (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.41)

Add feedback

c5d215777c229704a7862de577d40a73-Paper.pdf

Neural Information Processing SystemsFeb-11-2026, 02:28:12 GMT

estimator, learning, multi-label learning, (16 more...)

Neural Information Processing Systems

Country:

Asia > China > Jiangsu Province > Nanjing (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Online Active Learning with Surrogate Loss Functions

Neural Information Processing SystemsDec-24-2025, 20:49:04 GMT

We derive a novel active learning algorithm in the streaming setting for binary classification tasks. The algorithm leverages weak labels to minimize the number of label requests, and trains a model to optimize a surrogate loss on a resulting set of labeled and weak-labeled points. Our algorithm jointly admits two crucial properties: theoretical guarantees in the general agnostic setting and a strong empirical performance. Our theoretical analysis shows that the algorithm attains favorable generalization and label complexity bounds, while our empirical study on 18 real-world datasets demonstrate that the algorithm outperforms standard baselines, including the Margin Algorithm, or Uncertainty Sampling, a high-performing active learning algorithm favored by practitioners.

active learning, name change, online active learning, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Bayes Consistency vs. H-Consistency: The Interplay between Surrogate Loss Functions and the Scoring Function Class

Neural Information Processing SystemsDec-24-2025, 13:58:06 GMT

A fundamental question in multiclass classification concerns understanding the consistency properties of surrogate risk minimization algorithms, which minimize a (often convex) surrogate to the multiclass 0-1 loss. In particular, the framework of calibrated surrogates has played an important role in analyzing the Bayes consistency properties of such algorithms, i.e. in studying convergence to a Bayes optimal classifier (Zhang, 2004; Tewari and Bartlett, 2007). However, follow-up work has suggested this framework can be of limited value when studying H-consistency; in particular, concerns have been raised that even when the data comes from an underlying linear model, minimizing certain convex calibrated surrogates over linear scoring functions fails to recover the true model (Long and Servedio, 2013). In this paper, we investigate this apparent conundrum. We find that while some calibrated surrogates can indeed fail to provide H-consistency when minimized over a natural-looking but naively chosen scoring function class F, the situation can potentially be remedied by minimizing them over a more carefully chosen class of scoring functions F. In particular, for the popular one-vs-all hinge and logistic surrogates, both of which are calibrated (and therefore provide Bayes consistency) under realizable models, but were previously shown to pose problems for realizable H-consistency, we derive a form of scoring function class F that enables H-consistency. When H is the class of linear models, the class F consists of certain piecewise linear scoring functions that are characterized by the same number of parameters as in the linear case, and minimization over which can be performed using an adaptation of the min-pooling idea from neural network training. Our experiments confirm that the one-vs-all surrogates, when trained over this class of scoring functions F, yield better multiclass classifiers than when trained over standard linear scoring functions.

h-consistency, name change, surrogate loss function, (10 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.58)

Add feedback

Adversarial Surrogate Losses for Ordinal Regression

Rizal Fathony, Mohammad Ali Bashiri, Brian Ziebart

Neural Information Processing SystemsNov-21-2025, 12:52:45 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, bayesian inference, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > Wisconsin (0.05)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

SolarBoost: Distributed Photovoltaic Power Forecasting Amid Time-varying Grid Capacity

Geng, Linyuan, Yang, Linxiao, Gu, Xinyue, Sun, Liang

arXiv.org Artificial IntelligenceOct-27-2025

This paper presents SolarBoost, a novel approach for forecasting power output in distributed photovoltaic (DPV) systems. While existing centralized photovoltaic (CPV) methods are able to precisely model output dependencies due to uniformity, it is difficult to apply such techniques to DPV systems, as DPVs face challenges such as missing grid-level data, temporal shifts in installed capacity, geographic variability, and panel diversity. SolarBoost overcomes these challenges by modeling aggregated power output as a composite of output from small grids, where each grid output is modeled using a unit output function multiplied by its capacity. This approach decouples the homogeneous unit output function from dynamic capacity for accurate prediction. Efficient algorithms over an upper-bound approximation are proposed to overcome computational bottlenecks in loss functions. We demonstrate the superiority of grid-level modeling via theoretical analysis and experiments. SolarBoost has been validated through deployment across various cities in China, significantly reducing potential losses and provides valuable insights for the operation of power grids. The code for this work is available at https://github.com/DAMO-DI-ML/SolarBoost.

data mining, machine learning, solarboost, (20 more...)

arXiv.org Artificial Intelligence

2510.21129

Country: Asia > China (0.67)

Genre: Research Report > Promising Solution (0.48)

Industry: Energy > Renewable > Solar (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(3 more...)

Add feedback

NE: Surrogate-Assisted Federated Neighbor Embedding for Dimensionality Reduction

Neural Information Processing SystemsOct-10-2025, 21:16:37 GMT

dataset, experiment, neighbor, (13 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Virginia (0.04)
North America > United States > Ohio (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.41)

Add feedback

Unsupervised Learning for the Elementary Shortest Path Problem

Chen, Jingyi, Zhang, Xinyuan, Qian, Xinwu

arXiv.org Artificial IntelligenceAug-5-2025

The Elementary Shortest-Path Problem(ESPP) seeks a minimum cost path from s to t that visits each vertex at most once. The presence of negative-cost cycles renders the problem NP-hard. We present a probabilistic method for finding near-optimal ESPP, enabled by an unsupervised graph neural network that jointly learns node value estimates and edge-selection probabilities via a surrogate loss function. The loss provides a high probability certificate of finding near-optimal ESPP solutions by simultaneously reducing negative-cost cycles and embedding the desired algorithmic alignment. At inference time, a decoding algorithm transforms the learned edge probabilities into an elementary path. Experiments on graphs of up to 100 nodes show that the proposed method surpasses both unsupervised baselines and classical heuristics, while exhibiting high performance in cross-size and cross-topology generalization on unseen synthetic graphs.

artificial intelligence, machine learning, optimization problem, (18 more...)

arXiv.org Artificial Intelligence

2508.01557

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.68)

Add feedback

Online Learning-guided Learning Rate Adaptation via Gradient Alignment

Jiang, Ruichen, Kavis, Ali, Mokhtari, Aryan

arXiv.org Machine LearningJun-11-2025

The performance of an optimizer on large-scale deep learning models depends critically on fine-tuning the learning rate, often requiring an extensive grid search over base learning rates, schedules, and other hyperparameters. In this paper, we propose a principled framework called GALA (Gradient Alignment-based Learning rate Adaptation), which dynamically adjusts the learning rate by tracking the alignment between consecutive gradients and using a local curvature estimate. Guided by the convergence analysis, we formulate the problem of selecting the learning rate as a one-dimensional online learning problem. When paired with an online learning algorithm such as Follow-the-Regularized-Leader, our method produces a flexible, adaptive learning rate schedule that tends to increase when consecutive gradients are aligned and decrease otherwise. We establish a data-adaptive convergence rate for normalized SGD equipped with GALA in the smooth, nonconvex setting. Empirically, common optimizers such as SGD and Adam, when augmented with GALA, demonstrate robust performance across a wide range of initial learning rates and perform competitively without the need for tuning.

artificial intelligence, learning rate, machine learning, (18 more...)

arXiv.org Machine Learning

2506.08419

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Texas > Travis County > Austin (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.50)

Industry: Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Follow-the-Regularized-Leader with Adversarial Constraints

Ferreira, Ricardo N., Soares, Cláudia

arXiv.org Machine LearningMar-17-2025

Constrained Online Convex Optimization (COCO) can be seen as a generalization of the standard Online Convex Optimization (OCO) framework. At each round, a cost function and constraint function are revealed after a learner chooses an action. The goal is to minimize both the regret and cumulative constraint violation (CCV) against an adaptive adversary. We show for the first time that is possible to obtain the optimal $O(\sqrt{T})$ bound on both regret and CCV, improving the best known bounds of $O \left( \sqrt{T} \right)$ and $\~{O} \left( \sqrt{T} \right)$ for the regret and CCV, respectively.

artificial intelligence, constraint function, machine learning, (13 more...)

arXiv.org Machine Learning

2503.13366

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.35)

Add feedback